| Oracle® Enterprise Manager System Monitoring Plug-In Installation Guide for Oracle Engineered System Healthchecks Release 12.1.0.3.0 Part Number E27420-02 |
System Monitoring Plug-In Installation Guide for Oracle Engineered System Healthchecks
Release 12.1.0.3.0
E27420-02
November 2012
The Oracle Engineered System Healthchecks plug-in processes the XML output from the Exachk tool, which is included as part of Oracle Enterprise Manager system monitoring. The Exachk tool provides functionality for system administrators to automate the assessment of Exadata V2, X2-2, X2-8, and Exalogic systems for known configuration problems and best practices.
The Oracle Engineered System Healthchecks plug-in bundle consists of the following files:
Healthchecks Plug-in XML parser utility: A set of Perl scripts that parse the XML file generated by the Exachk tool and return the output in em_result format. This format can be processed by the Enterprise Manager Metric Engine.
A metadata XML file: defines the new Target Type and Metrics.
A Default Collections XML file: defines the schedule (that is, the time interval for the data collection of metrics). It also contains the logic for raising the alerts depending on results from the Exachk tool, and the messages to be shown for alerts.
A Report SQL file: defines a new Report, which will be created during the jar deployment.
A Messages dlf file: contains the risk, recommendation, benefit, and fail message to be displayed in case of a healthcheck failure.
The mpcui files: used for home page customization to define the content of the home page and the menus provided on this page.
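As a rough illustration of what em_result output looks like, the sketch below prints one line in the style commonly used by Enterprise Manager script-based collections: an `em_result=` prefix followed by `|`-delimited column values. It is written in shell rather than Perl, the helper name `emit_em_result` is hypothetical, and the two columns (a check ID and a status) are assumptions; the plug-in's actual columns are defined by its metadata XML file and are not shown in this guide.

```shell
#!/bin/sh
# Hypothetical helper, not part of the plug-in: prints one em_result line.
# The two columns (check ID, status) are illustrative assumptions only.
emit_em_result() {
    check_id=$1
    status=$2
    printf 'em_result=%s|%s\n' "$check_id" "$status"
}

# Example call with placeholder values.
emit_em_result "EXAMPLE_CHECK_ID" "FAIL"
```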
The following prerequisites must be met before you can deploy the plug-in:
Review Oracle Exadata Best Practices (MOS Note 757552.1) and Oracle Database Machine Monitoring Best Practices (MOS Note 1110675.1).
These documents provide a collection of articles related to best practices for the deployment of Oracle Database Machine and Exadata Storage Server.
Verify and enable the Exachk tool. This version of the plug-in supports Exachk 2.1.5 and above.
For more information on enabling the Exachk tool, refer to Run the Exachk Tool Automatically on Exadata and Run the Exachk Tool Automatically on Exalogic.
Operating Systems:
Oracle Linux release 5.3 or later.
Oracle Solaris 11 or later.
See the Plug-in Manager chapter in the Oracle Enterprise Manager Cloud Control Administrator's Guide for steps to deploy the plug-in:
http://docs.oracle.com/cd/E24628_01/doc.121/e24473/plugin_mngr.htm
After successfully deploying the plug-in, follow these steps to add the plug-in target to Cloud Control for central monitoring and management:
Log in to Enterprise Manager Cloud Control.
Click Setup, then Add Targets, and finally Add Targets Manually.
Select Add Non-Host Targets by Specifying Target Monitoring Properties. From the Target Type drop-down, select the Oracle Engineered System Healthchecks target type. Click Add Manually.
Select a management agent to host the new target.
Specify the name for the target instance, and set the following values for the instance properties:
Exachk Results Directory = <specify the path of the exachk output directory>
Max interval allowed (in days) between consecutive Exachk runs = 31 (This property will be defaulted to 31 days if unset.)
Where 31 is the default, auto-filled value of days between consecutive runs of the Exachk tool. You can leave this value as is to use the default of 31 days; otherwise, you can change the value to the customized time span you need (such as 20, 45, 60, etc.).
Click Test Connection to make sure the parameters you entered (such as the password) are correct. If the test was successful, proceed with adding targets.
Note:
After you deploy and configure the plug-in to monitor one or more targets in the environment, you can customize the monitoring settings of the plug-in. This alters the collection intervals and threshold settings of the metrics to meet the particular needs of your environment. If you decide to disable one or more metric collections, this could impact the reports that the metric is a part of.
The following environment variables should be set up before executing exachk (all three modes: -a, -s, -S).
The environment variable RAT_COPY_EM_XML_FILES should be set. Setting this environment variable will also enable copying of results files on all the nodes in the cluster:
export RAT_COPY_EM_XML_FILES=1
Specify the exachk output path. The result files will be copied to this location on all the nodes in the cluster:
export RAT_OUTPUT=<exachk output directory>
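The two settings can be combined in a small helper that is sourced before running exachk. This is a minimal sketch: the function name `prepare_exachk_env` and the example path are assumptions, and the `mkdir` call only creates the directory on the local node, so the directory still has to be created on every other node of the cluster.

```shell
#!/bin/sh
# Hypothetical helper, not part of the Exachk bundle.
# Exports the two variables described above for the current shell.
prepare_exachk_env() {
    out_dir=$1
    if [ -z "$out_dir" ]; then
        echo "usage: prepare_exachk_env <exachk output directory>" >&2
        return 1
    fi
    # Create the directory locally; other cluster nodes are not touched.
    mkdir -p "$out_dir" || return 1
    RAT_COPY_EM_XML_FILES=1
    RAT_OUTPUT=$out_dir
    export RAT_COPY_EM_XML_FILES RAT_OUTPUT
}

# Example call with an assumed, throwaway path.
prepare_exachk_env "${TMPDIR:-/tmp}/exachk_output_example"
echo "RAT_OUTPUT=$RAT_OUTPUT"
```

Note that there is no whitespace around `=` in the assignments; a shell treats `export NAME = value` as three separate arguments rather than an assignment.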
Note:
The exachk output directory should exist on all nodes of the cluster.
The Exachk tool bundle can be downloaded from My Oracle Support (See Doc 1070954.1).
The Exachk tool should be run in "silent" mode using the -S option. Refer to Frequency of Running the Exachk Tool to schedule the Exachk execution.
To use the -S option, you must configure SSH user equivalence for the RDBMS software owner (for example, oracle) from the execution database server to the oracle user on all other database servers. This configuration is mandatory. To validate a proper SSH user equivalence configuration, log in as the RDBMS software owner (for example, oracle) and execute the following command:
$ ssh -o NumberOfPasswordPrompts=0 -o StrictHostKeyChecking=no -l oracle dbServerName "echo \"oracle user equivalence is setup correctly\""
Where oracle is the RDBMS software owner and dbServerName is the database server hostname. Repeat for each database server.
If "Permission denied (publickey,gssapi-with-mic,password)" is returned, then the SSH user equivalence is not properly configured.
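Because the check has to be repeated for each database server, it can be wrapped in a loop. The sketch below reuses the same ssh options as the command above; the function name `check_user_equivalence` and the host names in the usage comment are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: runs the equivalence check against several hosts.
# $1 is the RDBMS software owner; remaining arguments are database server names.
check_user_equivalence() {
    owner=$1
    shift
    failed=0
    for host in "$@"; do
        # Same options as the single-host command; output is discarded and
        # only the exit status of ssh is used.
        if ssh -o NumberOfPasswordPrompts=0 -o StrictHostKeyChecking=no \
               -l "$owner" "$host" "echo ok" >/dev/null 2>&1; then
            echo "OK: $owner@$host"
        else
            echo "FAILED: $owner@$host" >&2
            failed=1
        fi
    done
    return "$failed"
}

# Usage (host names are placeholders):
#   check_user_equivalence oracle dbServer01 dbServer02
```

The function returns a non-zero status if any host fails, so it can gate a scheduled `-S` run.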
Note:
In the current version of the Exachk tool, regardless of the option used (-s or -S), Storage Server checks and InfiniBand switch checks are not performed.
A future release of the Exachk tool may provide silent support for InfiniBand switches.
The Exachk tool bundle can be downloaded from My Oracle Support (See Doc 1449226.1):
https://support.oracle.com
Run the Exachk tool as root using the -S option to run it in "silent" mode.
For limitations and more details on running Exachk on Exalogic, refer to the MOS note 1449226.1.
To configure the Oracle Engineered System Healthchecks plug-in with the Exalogic virtualized configuration, the Enterprise Manager management agent selected must be running on the Exalogic Enterprise Controller vServer. If an agent is not deployed to the Exalogic Enterprise Controller vServer, refer to the instructions in the Installing Oracle Management Agent chapter of the Oracle Enterprise Manager Cloud Control Basic Installation Guide to deploy an agent:
http://docs.oracle.com/cd/E24628_01/install.121/e22624/install_agent.htm#CACJEFJI
Additionally, make sure that the share with the Exachk tool is mounted with read-only permissions on Exalogic Enterprise Controller vServer. In the event the share is not mounted, mount it and make sure that the mount has a common read-only group permission with the Oracle user of the vServer, thus enabling read-only access (this operation can be performed as root):
mount -t nfs -r <Storage appliance ip>:<path> <path>
For example:
mkdir -p /mnt/common/general/
mount -t nfs -r 192.168.10.15:/export/common/general /mnt/common/general/
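One way to confirm afterwards that the share is mounted read-only is to look for the `ro` option in the mount table. The sketch below assumes a Linux vServer where `/proc/mounts` is available; the function name `is_mounted_readonly` is hypothetical, and it reads mount-table lines from standard input so that it can be tried without touching a real mount.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the given mount point appears on stdin
# (in /proc/mounts format) with the "ro" option.
is_mounted_readonly() {
    # Drop a trailing slash; the mount table lists paths without one.
    mp=${1%/}
    awk -v mp="$mp" '
        $2 == mp {
            n = split($4, opts, ",")
            for (i = 1; i <= n; i++) if (opts[i] == "ro") found = 1
        }
        END { exit(found ? 0 : 1) }
    '
}

# Usage on the vServer (path taken from the example above):
#   is_mounted_readonly /mnt/common/general/ < /proc/mounts && echo "read-only"
```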
Note:
The -r option is for "read only" permissions.
Once configured, network communication is standardized to communicate to the Enterprise Manager OMS server.
Essentially, the Agent running in the Enterprise Controller vServer communicates to the OHS proxy via the IPoIB-virt-admin network, and the OHS proxy in turn communicates to the Enterprise Manager OMS server via the EoIB-external-mgmt network.
The Exachk tool should be run in the following scenarios:
Once a month on a regular basis.
After taking corrective actions for the failures reported by the Exachk tool.
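To check by hand whether the results directory has received output within the expected window, a sketch like the following may help. It only looks at file modification times and is not how the plug-in itself evaluates the interval; the function name `exachk_results_fresh` is hypothetical, and the 31-day default simply mirrors the default of the target's maximum-interval property.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the directory contains at least one file
# modified within the last max_days days (default 31).
# Returns 2 if the directory does not exist.
exachk_results_fresh() {
    results_dir=$1
    max_days=${2:-31}
    [ -d "$results_dir" ] || return 2
    # -mtime -N matches files modified less than N days ago.
    recent=$(find "$results_dir" -type f -mtime "-$max_days" | head -n 1)
    [ -n "$recent" ]
}

# Usage (the path is a placeholder for your exachk output directory):
#   exachk_results_fresh /path/to/exachk/output 31 || echo "Exachk run overdue"
```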
The checks executed by the Exachk tool are placed under the metric ExadataResults.
You can also evaluate the metrics any time by running the following command:
./emctl control agent runCollection <targetName>:oracle_exadata_hc ExadataResults
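If the collection is triggered often, the command can be wrapped so that only the target name varies. In this sketch the function name `run_healthcheck_collection` and the `EMCTL` override variable are assumptions; the arguments passed to emctl are exactly those of the command above.

```shell
#!/bin/sh
# Hypothetical wrapper around the runCollection command shown above.
# EMCTL may be set to the full path of the agent's emctl; it defaults to ./emctl.
run_healthcheck_collection() {
    target_name=$1
    if [ -z "$target_name" ]; then
        echo "usage: run_healthcheck_collection <targetName>" >&2
        return 1
    fi
    "${EMCTL:-./emctl}" control agent runCollection \
        "${target_name}:oracle_exadata_hc" ExadataResults
}

# Usage (run from the directory that contains emctl, or set EMCTL):
#   run_healthcheck_collection <targetName>
```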
A healthcheck results report, which will be visible in the target's home page, lists all metrics along with the failed checks, irrespective of whether the alert is raised or not.
You can also view the report in a printable view using the menu from the target home page:
On the target home page click Targets, then Information Publisher Reports.
Click Exachk Execution Results.
Click Continue.
You will be directed to the reports screen where you can filter the results on the metric type. Figure 1 shows an example of the report results visible on the target homepage.
This Report is bundled along with the plug-in and will be created in the OMS during deployment.
To download the Exachk Execution Results report as a .csv file:
On the target home page click Oracle Engineered System Healthchecks then Information Publisher Reports.
Click the Exachk Execution Results link. Click Continue.
On the Reports page, you can filter the results by metric type. In the Exachk Execution Results table region, click the icon present in the top-right corner to download the report as a .csv file.
By default the plug-in is designed to raise alerts for the following checks only:
Verify Disk Cache Policy on database server.
Verify database server disk controllers use writeback cache.
Verify RAID Controller Battery Condition (database server).
Verify RAID Controller Battery Temperature (database server).
Verify Database Server Virtual Drive Configuration.
Verify Database Server Physical Drive Configuration.
Imagehistory version comparison across system.
Verify Hardware and Firmware on Database and Storage Servers (CheckHWnFWProfile) [Database Server].
Verify Software on Storage Servers (CheckSWProfile.sh).
NFS Mount Point - Attribute Caching.
/conf/configvalid File.
Backend Check.
Exachk not running.
Results and Exception file(s) missing.
Metric Parsing Failed.
To change the behavior of alerts, you can apply the Oracle provided Engineered System Healthchecks No Alert template to disable alerts for all checks executed by the Exachk tool.
Follow the steps below to apply the Oracle-provided templates:
Log in to Enterprise Manager Cloud Control.
Click Enterprise from the upper-left-corner of the Cloud Control home page. Click Monitoring from the drop-down menu, then Monitoring Templates.
Select the "Display Oracle provided templates and Oracle Certified templates" check box to include these templates in the search. Search for the Oracle provided Engineered System Healthchecks No Alert template.
Select the radio button next to the template name and click Apply.
In the Apply Monitoring Template screen, click Add and select the targets on which you wish to apply this template. Click OK.
Note:
You can enable the alerts by applying the Oracle provided Engineered System Healthchecks template, following the same steps as above. Applying this template will enable alerts for the check names specified in the Alerts section.
To define your own monitoring template and enable alerts for checks, follow the steps below:
Identify the Check ID of the check for which you want to be alerted. The Check ID uniquely identifies each check. For the check name for which you want to set up an alert, get the corresponding Check ID from the Exachk Execution Results report.
Click Enterprise from the upper-left-corner of the Cloud Control home page, then click Monitoring from the drop-down menu, then select Monitoring Templates.
Click the Create button. In the next screen, you will be prompted to select a target instance for which to create the template. Select the Oracle Engineered System Healthchecks target instance that you have set up.
In the General tab of the Create Monitoring Template screen, enter a template name and description.
In the Metric Threshold tab of the Create Monitoring Template screen, click edit (pencil icon) next to the status column of Engineered System Healthchecks metric in the tree structure displayed. This will redirect to the Edit Advance Settings: Status page.
Click the Add button on the top right corner of the Create Monitoring Template screen. In the text box that appears, enter the Check ID for which you want to be alerted. Enter FAIL as the critical threshold.
Repeat for every Check ID for which you want alerts enabled.
Click Continue and then OK to close the Edit Advance Settings: Status page.
In the Monitoring Templates screen, select the radio button next to the template you created and click Apply.
In the Apply Monitoring Template screen, click Add and select the targets on which you wish to apply this template. Click OK.
To disable a specific alert, follow the steps below:
Select the Metric and Policy Settings under the Monitoring menu in the target home page.
In this page, update the value for the Critical Threshold column to blank for the alert that you wish to disable and click OK.
The following are known issues for Oracle Engineered System Healthchecks plug-in Release 12.1.0.3.0:
Upgrading to 12.1.0.3.0 Causes an Error Message to Display
After the plug-in on the OMS is upgraded to 12.1.0.3.0 from an earlier version, an error message is displayed on the target homepage. The error is caused by caching, as the error message itself indicates.
To work around this issue, clear the browser cache and reload the page.
See the Plug-in Manager chapter in the Oracle Enterprise Manager Cloud Control Administrator's Guide for steps to undeploy the plug-in:
http://docs.oracle.com/cd/E24628_01/doc.121/e24473/plugin_mngr.htm
For information about Oracle's commitment to accessibility, visit the Oracle Accessibility Program website at http://www.oracle.com/pls/topic/lookup?ctx=acc&id=docacc.
Oracle customers have access to electronic support through My Oracle Support. For information, visit http://www.oracle.com/pls/topic/lookup?ctx=acc&id=info or visit http://www.oracle.com/pls/topic/lookup?ctx=acc&id=trs if you are hearing impaired.
Copyright © 2012, Oracle and/or its affiliates. All rights reserved.
This software and related documentation are provided under a license agreement containing restrictions on use and disclosure and are protected by intellectual property laws. Except as expressly permitted in your license agreement or allowed by law, you may not use, copy, reproduce, translate, broadcast, modify, license, transmit, distribute, exhibit, perform, publish, or display any part, in any form, or by any means. Reverse engineering, disassembly, or decompilation of this software, unless required by law for interoperability, is prohibited.
The information contained herein is subject to change without notice and is not warranted to be error-free. If you find any errors, please report them to us in writing.
If this is software or related documentation that is delivered to the U.S. Government or anyone licensing it on behalf of the U.S. Government, the following notice is applicable:
U.S. GOVERNMENT RIGHTS Programs, software, databases, and related documentation and technical data delivered to U.S. Government customers are "commercial computer software" or "commercial technical data" pursuant to the applicable Federal Acquisition Regulation and agency-specific supplemental regulations. As such, the use, duplication, disclosure, modification, and adaptation shall be subject to the restrictions and license terms set forth in the applicable Government contract, and, to the extent applicable by the terms of the Government contract, the additional rights set forth in FAR 52.227-19, Commercial Computer Software License (December 2007). Oracle America, Inc., 500 Oracle Parkway, Redwood City, CA 94065.
This software or hardware is developed for general use in a variety of information management applications. It is not developed or intended for use in any inherently dangerous applications, including applications that may create a risk of personal injury. If you use this software or hardware in dangerous applications, then you shall be responsible to take all appropriate fail-safe, backup, redundancy, and other measures to ensure its safe use. Oracle Corporation and its affiliates disclaim any liability for any damages caused by use of this software or hardware in dangerous applications.
Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners.
Intel and Intel Xeon are trademarks or registered trademarks of Intel Corporation. All SPARC trademarks are used under license and are trademarks or registered trademarks of SPARC International, Inc. AMD, Opteron, the AMD logo, and the AMD Opteron logo are trademarks or registered trademarks of Advanced Micro Devices. UNIX is a registered trademark of The Open Group.
This software or hardware and documentation may provide access to or information on content, products, and services from third parties. Oracle Corporation and its affiliates are not responsible for and expressly disclaim all warranties of any kind with respect to third-party content, products, and services. Oracle Corporation and its affiliates will not be responsible for any loss, costs, or damages incurred due to your access to or use of third-party content, products, or services.